A machine learning approach to support triaging of primary versus secondary headache patients using complete blood count

Headaches account for up to 4.5% of emergency department visits, where they present a significant diagnostic challenge. While primary headaches are benign, secondary headaches can be life-threatening. It is essential to rapidly differentiate between primary and secondary headaches as the latter require immediate diagnostic work-up. Current assessment relies on subjective measures; time constraints can result in overuse of diagnostic neuroimaging, prolonging diagnosis, and adding to economic burden. There is therefore an unmet need for a time- and cost-efficient, quantitative triaging tool to guide further diagnostic testing. Routine blood tests may provide important diagnostic and prognostic biomarkers indicating underlying headache causes. In this retrospective study (approved by the UK Medicines and Healthcare products Regulatory Agency Independent Scientific Advisory Committee for Clinical Practice Research Datalink (CPRD) research [20_000173]), UK CPRD real-world data from patients (n = 121,241) presenting with headache from 1993–2021 were used to generate a predictive model based on a machine learning (ML) approach for primary versus secondary headaches. An ML-based predictive model was constructed using two different methods (logistic regression and random forest) and the following predictors were evaluated: ten standard measurements from the complete blood count (CBC) test, 19 ratios of the ten CBC test parameters, and patient demographic and clinical characteristics. The model’s predictive performance was assessed using a set of cross-validated model performance metrics. The final predictive model showed modest predictive accuracy using the random forest method (balanced accuracy: 0.7405). The sensitivity, specificity, false negative rate (incorrect prediction of secondary headache as primary headache), and false positive rate (incorrect prediction of primary headache as secondary headache) were 58%, 90%, 10%, and 42%, respectively.
The ML-based prediction model developed could provide a useful time- and cost-effective quantitative clinical tool to facilitate the triaging of patients presenting to the clinic with headache.


Introduction
Headache is a common nervous system disorder, affecting approximately 50% of the general population [1,2]. It can be classified as a primary headache disorder, typically migraine, tension, or cluster headaches [1], or a secondary headache disorder, which includes: giant cell arteritis, meningitis/encephalitis, subarachnoid hemorrhage (SAH), cerebral venous thrombosis, idiopathic intracranial hypertension, brain tumor, and ischemic stroke [1,3,4]. Although secondary headaches are rare compared with primary headaches, they are extremely important to recognize as they can require immediate intervention [1,3], with further and rapid diagnostic evaluation including neuroimaging and lumbar puncture often needed [3,5].
In primary care, headache is one of the most common presenting symptoms, with most due to primary causes, although many headache sufferers do not receive a specific diagnosis [6]. Headaches also account for up to 4.5% of all visits to the emergency department (ED) [2,3,5,7,8], with one observational study revealing an equal prevalence of primary versus secondary headache etiologies presenting to the ED (48 vs. 52%, respectively) [2]. Thus, accurate diagnosis of the underlying cause of headache and treatment initiation can be critical.
Headache has a historic reputation as being one of the most poorly classified neurologic disorders, and the International Headache Society classification system was developed to provide a hierarchy of criteria for diagnosis [9]. In the United Kingdom (UK) primary care setting, recognition of the importance of effective headache diagnosis and management has prompted the additional training of general practitioners (GPs) with special interest in headache, and the establishment of headache clinics in general practice [10,11]. Despite this, a varied degree of confidence in headache diagnosis and treatment still exists amongst emergency and primary care clinicians, due in part to a lack of specific experience in neurology and poor use of the International Headache Society classification system [2,6,10]. A study conducted in Canada reported that misdiagnosis or diagnostic uncertainty occurred in more than one-third (35.7%) of cases of neurologic complaints in the ED, when comparing the initial diagnosis to the diagnosis made by the consulting neurologist [12]. In the UK, where there is a concerning shortfall of neurologists [10], the fear of clinical neurology among doctors outside the discipline has been described as amounting to 'neurophobia' [10]. This can, together with patient anxiety and legal concerns, result in multiple appointments and unnecessary investigations, which may subsequently increase diagnosis times, healthcare costs, and economic burden [1,4,13].
The effective triaging of patients presenting with primary versus secondary headache is an important and currently unmet need [1,5,14]. Consideration of patients' medical history and physical examination are currently the most important aspects of headache assessment, and clinicians must be vigilant for "red flag" symptoms that are characteristic of serious secondary causes [1,3]. This qualitative approach could be complemented by a quantitative methodologic tool, which could potentially streamline diagnosis, facilitate clinical decision-making, and reduce unnecessary investigations [2,5,14].
A complete blood count (CBC) is one of the most commonly requested blood tests [15] and its results may provide important diagnostic and prognostic biomarkers indicating underlying causes of headache. Several CBC parameters have been investigated for their ability to distinguish between primary and secondary headaches. A retrospective study of patients presenting to the ED with headache reported that leukocytosis or an increase in the percentage of polymorphonuclear leukocytes (PMNs) had a sensitivity of 89.8%, a specificity of 46.7%, a positive predictive value of 82.1%, and a negative predictive value of 62.8% for diagnosing SAH within 6 or 12 hours of ED admission [16]. The ratio of neutrophils to lymphocytes (NLR) has also received increasing attention as a diagnostic and prognostic marker of inflammation and can be easily calculated from standard measurements of CBC tests. A retrospective study found NLR to be higher in people presenting to the ED with a migraine attack versus people without a headache [17]. A further retrospective, single-center analysis in ED patients presenting with headache accompanied by nausea and vomiting found that NLR could distinguish between those with migraine and those with SAH [16]. Although this retrospective study involved a limited number of clinical cases, median NLR values were found to be significantly higher in patients with SAH compared with those with migraine and other headaches (both p < 0.001) [18].

UK Medicines and Healthcare products Regulatory Agency (MHRA); however, the interpretation and conclusions contained in this report are those of the author(s) alone. The data used in this study are third-party data (i.e., data not owned or collected by the author(s)) provided by the CPRD under a license with the UK MHRA. A sub-license agreement/third-party agreement is required for data access by third parties (e.g., journal editors, reviewers, etc.) who were not included in the research team of the original study protocol approved by the CPRD's Research Data Governance process. The authors did not have any special access privileges that others would not have. Others can apply for access to the raw data (i.e., the data used to generate the relevant datasets for the current study) at the following URL: https://cprd.com/data-access/.

Funding:
The study and third-party medical writing assistance was funded by Roche Diagnostics International Ltd (Rotkreuz, Switzerland). The funder (Roche Diagnostics International Ltd) through its employees at the time of the study (FY, TM, BT-N, CM, CL, and ED) had an active role in the design and conduct of the study; management, analysis, and interpretation of the data; preparation and review of the manuscript; and decision to submit the manuscript for publication.
Competing interests: I have read the journal's policy and the authors of this manuscript have the following competing interests: FY, TM, BT-N and CM are employees of Roche. CL and ED were employees of Roche at the time the study was conducted. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Measurements derived from CBC tests have also been evaluated in the prediction of secondary headache severity. In one retrospective, single-center review, patients with SAH whose leukocyte count was >15 × 10⁹/L during admission were more than three times more likely to develop vasospasm [19]. In another study of patients with SAH, the authors reported that patients admitted with spontaneous SAH with a leukocyte count of >20,000 had a mortality rate of 50% [20]. Furthermore, mean platelet volume and platelet distribution width have been shown to be increased in patients with cerebral venous thrombosis and brain parenchymal lesions compared with patients with cerebral venous thrombosis without lesions [21].
Aligning with the concept of using routine blood test results as a triaging tool to assist physicians when deciding whether to perform neuroimaging on patients presenting with severe headache, this study was designed to evaluate a machine learning (ML)-based approach to classify primary versus secondary headache using real-world data (RWD) derived from a large patient group.

Study design
This was a retrospective, observational study of RWD from patients presenting with a complaint of headache to the clinic. Due to the limitation of accessing RWD in the ED and evaluating the suitability to the study objectives, primary care data from the UK Clinical Practice Research Datalink (CPRD) were used.

Database
The CPRD includes longitudinal electronic health records generated by GP practices in the UK. The current study included patients from two CPRD databases [22][23][24]: CPRD GOLD (using a data-cut of July 2021), and CPRD Aurum (using a data-cut of June 2021).

Study population
Eligible patients with a record of presenting to a GP practice with a complaint of headache were identified in the CPRD GOLD and CPRD Aurum databases separately, according to the list of codes provided in S1-S10 Tables. The date of the patient's first clinic visit with headache symptoms was defined as the index date. Patients were included for analysis if they: 1) received a diagnosis of either primary or secondary headache within a 30-day window after the index date, 2) had laboratory results available for all ten specific parameters from the CBC test within a 30-day window after the index date and 3) had data classed as of "research acceptable" quality by the CPRD. Qualifying patients from both CPRD GOLD and CPRD Aurum were then merged to form an analytical dataset. To avoid possible duplicate individuals in the analytical dataset, the following steps were employed: 1) if a patient had multiple sets of records that fulfilled the inclusion criteria, only data from their latest index date were included; and 2) for clinics that changed enrolment from CPRD GOLD to CPRD Aurum, qualifying patients were only identified during the period that the clinic was enrolled for CPRD Aurum and not for CPRD GOLD.
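The first de-duplication step above (keeping only the latest index date per patient) can be sketched as follows. This is a minimal illustration, not the study's code: the record layout, the field names `patient_id`, `index_date`, and `source`, and the toy records are all assumptions.

```python
from datetime import date


def keep_latest_index_date(records):
    """Return one record per patient: the one with the latest index date."""
    latest = {}
    for rec in records:
        pid = rec["patient_id"]  # hypothetical field name
        if pid not in latest or rec["index_date"] > latest[pid]["index_date"]:
            latest[pid] = rec
    return list(latest.values())


# Toy records (invented for illustration only).
records = [
    {"patient_id": "A", "index_date": date(2015, 3, 1), "source": "GOLD"},
    {"patient_id": "A", "index_date": date(2019, 7, 12), "source": "GOLD"},
    {"patient_id": "B", "index_date": date(2020, 1, 5), "source": "Aurum"},
]

deduplicated = keep_latest_index_date(records)
for rec in deduplicated:
    print(rec["patient_id"], rec["index_date"])
```

The second step (restricting clinics that migrated from CPRD GOLD to CPRD Aurum to their Aurum enrolment period) would need practice-level enrolment dates and is not shown.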
The study was conducted in accordance with the principles founded in the Declaration of Helsinki of 1975 (revised 2013). The study was approved by the UK Medicines and Healthcare products Regulatory Agency Independent Scientific Advisory Committee for CPRD research (20_000173). All patient data were anonymized; thus, the requirement for patient consent was waived. Individual patients can opt out of sharing their records with the CPRD. An overview of the study is available online [25].

Outcome definitions and study variables
Primary headaches were defined as migraine, tension-type headache, and cluster headache, while secondary headaches were defined as those caused by ischemic stroke, cerebral venous thrombosis, hemorrhage (including SAH), arteritis, and angiitis (S1-S6 Tables). In the event that a patient had diagnoses contributing to both primary and secondary headaches during the same episode, the patient was categorized as having secondary headache.

Descriptive analysis
First, patients with primary headaches and secondary headaches included in the analytical cohort were described, and the differences between the two groups were assessed using the t-test for continuous variables and the chi-squared test for categorical variables. The Hotelling T²-test was then used to examine whether data from the primary and secondary headache groups could be differentiated based on several collective variables, and was performed using the T2.test function from the R project for statistical computing [26].
Data from patients with outlier results (the top and bottom 1% of the values) for any of the ten CBC parameters and body mass index were excluded from the analysis.
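One way to read the outlier rule is as a per-variable percentile filter: a patient is dropped if any listed variable falls below the 1st or above the 99th percentile. The sketch below assumes linear-interpolation percentiles and inclusive bounds; the study does not state which percentile method was used, and the column names and data are invented.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0-100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def drop_outliers(rows, columns, lower=1, upper=99):
    """Keep rows whose value lies within [lower, upper] percentiles for every column."""
    bounds = {}
    for col in columns:
        col_values = [row[col] for row in rows]
        bounds[col] = (percentile(col_values, lower), percentile(col_values, upper))
    return [
        row for row in rows
        if all(bounds[col][0] <= row[col] <= bounds[col][1] for col in columns)
    ]


# Toy data: 201 patients with an evenly spread "wbc" value and a constant "bmi".
rows = [{"wbc": float(i), "bmi": 25.0} for i in range(201)]
kept = drop_outliers(rows, ["wbc", "bmi"])
print(len(rows), "->", len(kept))
```

Because the filter is applied across several variables at once, the share of patients removed can exceed 2% in real data.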

Prediction model development, performance, and validation
To develop the prediction model for primary headaches versus secondary headaches, two separate ML-based approaches (logistic regression and random forest) were used and evaluated. A total of 31 candidate predictors for the prediction models were preselected, including: demographic variables (age and sex), ten parameters from the CBC test (blood cell counts) and 19 variables derived from the ratios of the ten parameters from the CBC test. Earlier reports indicate that variables comprising specific blood cell ratios are medically relevant [17,18].
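Ratio predictors of this kind are simple to derive from the raw counts. The sketch below computes the neutrophil-to-lymphocyte ratio (NLR) discussed in the Introduction; the second pair (platelets to lymphocytes) is a hypothetical example, since the 19 ratios used in the study are not listed in this section. Field names and values are invented.

```python
def add_ratios(row, pairs):
    """Return a copy of row with one numerator/denominator ratio added per pair."""
    out = dict(row)
    for numerator, denominator in pairs:
        if row[denominator] == 0:
            # A zero denominator has no defined ratio; such rows must be handled upstream.
            raise ValueError(f"zero value for {denominator}")
        out[f"{numerator}_to_{denominator}"] = row[numerator] / row[denominator]
    return out


# Invented CBC values for a single patient (units omitted for brevity).
cbc = {"neutrophils": 4.0, "lymphocytes": 2.0, "platelets": 250.0}

# NLR is named in the text; the platelet pair is an assumed illustration.
pairs = [("neutrophils", "lymphocytes"), ("platelets", "lymphocytes")]

with_ratios = add_ratios(cbc, pairs)
print(with_ratios["neutrophils_to_lymphocytes"])
```

Each ratio is a deterministic function of two existing predictors, which is why the ratio variables end up strongly correlated with the raw counts.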
Data normalization and optimization were performed by min-max scaling to standardize the scale of CBC-related values included in the model (preliminary tests showed that min-max scaling outperformed z-score scaling). No data balancing technique was used; rather, all available data in the analytical cohort were used.
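For reference, the two scaling schemes compared above differ as follows. This is a generic sketch on invented values, not the study's pipeline; the use of the population standard deviation for the z-score is an assumption.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center values on their mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


wbc = [4.0, 6.0, 8.0]  # invented white blood cell counts
print(min_max_scale(wbc))
print(z_score_scale(wbc))
```

In a train/test setting the minimum and maximum would normally be estimated on the training portion only and then reused for the test portion, so that no information from the test data enters the model.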

Model construction
The ability of the blood cell count ratios to predict headache type was determined by comparing the predictive performance metrics of each prediction model with and without the 19 variables derived from the ratios of the ten parameters from the CBC test. As part of the model specification, the following feature selection techniques were assessed with the aim of simplifying the model while maintaining good performance: lasso regularization [27], recursive feature elimination [28], weight of feature importance (analyzed by logistic regression), and correlation analysis for any candidate predictor with headache type.
For evaluation of the prediction models developed using logistic regression and random forest, five-fold cross-validation was used to obtain the mean and standard deviation for each of the following model performance metrics: 1) accuracy; 2) balanced accuracy; 3) average precision; 4) F1-score; and 5) area under the curve. Since the data were imbalanced and secondary headache is more serious than primary headache, the model was tuned by changing the predicted probability threshold cut off from 0.5 to 0.3. This was done with the aim of finding a prediction model that balanced the correct detection of secondary headaches with a false negative rate (i.e. incorrect prediction of secondary headache as primary headache) of less than 10%, whilst minimizing overcalling of primary headache as secondary headache. The final prediction model was then evaluated using a single 80:20 train/test dataset split. To visualize and summarize the performance of the prediction model, a confusion matrix was generated.
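The effect of lowering the probability threshold from 0.5 to 0.3 can be illustrated with a small confusion-matrix calculation. The sketch treats secondary headache as the positive class (label 1) and uses the conventional definitions of sensitivity, specificity, and false negative rate; these definitions, the `>=` comparison, and the toy probabilities are assumptions and may differ from how the study computed its reported rates.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) with label 1 (secondary headache) as the positive class."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(tp, fp, tn, fn):
    """Compute threshold-dependent metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "false_negative_rate": fn / (fn + tp),  # conventional definition
    }


# Toy labels and predicted probabilities of secondary headache (invented).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.2, 0.45, 0.35, 0.25, 0.2, 0.1, 0.05]

at_05 = confusion_counts(y_true, y_prob, 0.5)
at_03 = confusion_counts(y_true, y_prob, 0.3)
print(at_05, summarize(*at_05))
print(at_03, summarize(*at_03))
```

On this toy data, lowering the threshold catches one more secondary case at the cost of two additional false alarms, which is the trade-off the tuning step is meant to manage.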

Cohort description
A total of 121,241 patients satisfied the inclusion/exclusion criteria and formed the analytical cohort, comprising 108,906 patients with primary headache and 12,335 patients with secondary headache (Fig 1). Baseline demographic and clinical characteristics of the analytical cohort, stratified into primary and secondary headache groups, are presented in Table 1. Overall, patients presenting with primary headache were significantly younger than those presenting with secondary headache (mean 44 vs. 70 years; p < 0.001). The majority of primary headache patients were between 21 and 50 years of age (n = 64,004; 58.7%), whereas the majority of secondary headache patients were between 61 and 80 years of age (n = 6,454; 52.3%). In both headache groups, there were considerably more female than male patients (78.1% vs. 21.9% and 65.7% vs. 34.3%, for primary and secondary headache groups, respectively).

Descriptive statistical analyses
Results from descriptive statistical analyses revealed statistically significant differences (p < 0.001) between the population means for almost all of the ten parameters from the CBC tests and the 19 ratio variables measured in patients between headache groups (Tables 2 and 3). However, there were substantial overlaps in the range of values for these variables with broadly similar mean values (S1 and S2 Figs). Further analysis using the Hotelling T²-test showed that the two headache groups could not be properly differentiated (Table 4).
Table 5 shows five-fold cross-validated performance metrics of the prediction models, with and without the ratio variables of blood cell count parameters before varying the probability threshold, using logistic regression and random forest methods separately. The performance metrics suggested that the random forest method without the blood cell count ratio variables had an overall better predictive performance for predicting primary headaches versus secondary headaches. Spearman's correlation matrix and feature weight analysis suggested that age group, sex, total WBC count, neutrophil count, and monocyte count correlated strongly with headache type (S3 Fig).

Predicting primary headaches versus secondary headaches
After changing the probability threshold, the final prediction model showed accurate and robust performance with a balanced accuracy of 0.7405 (Table 6), reflecting an ability to achieve a false negative rate (incorrect prediction of secondary headache as primary headache) of 10% while maintaining a false positive rate (incorrect prediction of primary headache as secondary headache) of 42% (Fig 2). The sensitivity (correct prediction of secondary headache as secondary headache) and specificity (correct prediction of primary headache as primary headache) of the final model were 58% and 90%, respectively (Fig 2).

Discussion
In summary, descriptive statistical analysis revealed substantial overlap in values for the ten CBC parameters and 19 ratio variables measured in patients in both headache groups, and patients with primary and secondary headaches could not be properly differentiated based on results from the Hotelling T²-test. However, we have developed an ML-based prediction model with a modest predictive accuracy to differentiate between primary and secondary headaches on the basis of readily available patient characteristics and routine blood test results. Diagnostic procedures and acute treatment for headaches may vary across different countries, depending on factors such as catchment area, structure of the care facility, in-house protocols, and local medical staff [3]. In current clinical practice, an excess of patients presenting to the ED with a severe headache are referred for neuroimaging, despite current guidelines recommending against routine neuroimaging for headaches [33]. In one European study, neurologic examination was performed in 72.5% of patients presenting to the ED with headache; 60.9% subsequently underwent technical investigation and 53.2% had non-contrast cranial computed tomography [3]. However, unnecessary investigations should be avoided [1]; it is not appropriate, for example, to routinely use computed tomography in headache patients, because of the high cost and radiation exposure [16].
Our study suggests that a CBC-based ML algorithm could mitigate this problem, by simplifying the triage of patients who require such diagnostic procedures. Our findings indicate that age group, sex, and ten parameters that are usually collected during CBC tests represent convenient, measurable variables for use in an ML prediction model, to differentiate patients with primary and secondary headache. We also developed a prediction model with modest performance, predicting almost 60% of secondary headache patients and 90% of primary headache patients. Future studies may look at including certain clinical characteristics (such as a history of brain trauma, hypertension, epilepsy, or stroke) in the prediction model to assess whether their addition could improve its performance further.
In a similar study, the sensitivity of leukocytosis or increase in the percentage of PMNs in cases of patients with SAH was investigated, with a view to developing a non-invasive blood test to facilitate diagnosis [16]. Investigators concluded that CBC had an excellent sensitivity (89.8%) in the exclusion of SAH in non-traumatic headache. Specificity, however, was poor (46.7%), as leukocytosis can result from other headache etiologies such as migraine, temporal arteritis (giant cell arteritis), and hypertension, suggesting CBC levels could only be used to rule out, rather than confirm, SAH [16].
Table 3 footnotes (https://doi.org/10.1371/journal.pone.0282237.t003): Only patients who had non-zero values for all ten parameters were included, so this analysis was performed on data from n = 67,974 patients. Patients with outlier results (top and bottom 1% of the values) on any parameter of the blood test and BMI were removed prior to deriving ratio variables. b: p values were derived from t-tests to examine the difference between the group means for the primary and secondary headache groups.
Our ML model has potentially important clinical implications. The reasonably low error rate (10%) of misclassifying secondary headache as primary headache could aid clinicians' decision-making on which patients to refer for further examination (e.g. neuroimaging and/or lumbar puncture), thereby avoiding unnecessary procedures and reducing the drain on healthcare resources. It is envisaged that our model could be used alongside taking a detailed headache history and, if indicated, a thorough neurologic examination. In cases of abnormal findings, neuroimaging should be performed to rule out secondary headaches. Currently, if there is a clinical suspicion of SAH, computed tomography is undertaken, followed by lumbar puncture if the scan is inconclusive [34]. As our model is able to distinguish between primary and secondary headaches with a sensitivity of 58%, a specificity of 90%, a false negative rate of 10% and false positive rate of 42%, it is hoped that it may help reduce the number of unnecessary procedures. Furthermore, the model will be of particular use in countries where neuroimaging is not readily available, but should be used with caution, particularly when headache history is sparse. As a point to consider, because primary headache accounted for approximately 90% of all headache cases in this study, a default prediction model with good prediction accuracy may have been informed by an imbalanced data set (i.e. a model that predicts primary headache for all patients will be correct in 90% of cases). However, the model developed herein accounted for such imbalanced data and focused on the accurate prediction of secondary headaches, resulting in a modest predictive model with a balanced accuracy of 0.7405. Furthermore, as healthcare progresses, in the future our model could be expanded to include other relevant parameters to further improve the model performance.
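The point about imbalanced data can be made concrete with the cohort counts reported in the Results. The short calculation below compares plain accuracy and balanced accuracy for a hypothetical "always predict primary" baseline, and recomputes the balanced accuracy implied by the rounded sensitivity and specificity; treating secondary headache as the positive class is an assumption of the sketch.

```python
# Cohort sizes as reported in the cohort description.
n_primary = 108_906
n_secondary = 12_335
n_total = n_primary + n_secondary

# Baseline that labels every patient as primary headache.
baseline_accuracy = n_primary / n_total          # all primary cases correct
baseline_sensitivity = 0 / n_secondary           # no secondary case detected
baseline_specificity = n_primary / n_primary     # every primary case correct
baseline_balanced_accuracy = (baseline_sensitivity + baseline_specificity) / 2

# Balanced accuracy implied by the rounded sensitivity (58%) and specificity (90%).
implied_balanced_accuracy = (0.58 + 0.90) / 2

print(f"baseline accuracy:          {baseline_accuracy:.3f}")
print(f"baseline balanced accuracy: {baseline_balanced_accuracy:.3f}")
print(f"implied balanced accuracy:  {implied_balanced_accuracy:.3f}")
```

The baseline reaches roughly 90% accuracy while detecting no secondary headaches at all, so its balanced accuracy is 0.5; the value implied by the rounded percentages, 0.74, agrees with the reported 0.7405 up to rounding.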
When comparing random forest and logistic regression methods, the former marginally outperformed the latter in nearly all prediction performance metrics, particularly those suitable for imbalanced data sets (balanced accuracy, average precision, and F1-score). This is likely due to the fact that the random forest model is fundamentally a large number of uncorrelated individual decision trees operating as an ensemble/committee and is hence better suited to capture interactions between variables. The logistic regression model, on the other hand, has a linear function and cannot capture such interactions. Correlated features have an influence on the prediction performance of the model, which was suggested by performance metrics to be slightly compromised by the features of blood cell ratios. Although this is counterintuitive, according to the premise that "more data is better", data are required to be independent and identically distributed, and it follows that such correlations are detrimental for the performance of ML techniques. The features of blood cell ratios are highly inter-dependent, being strongly correlated both to each other and to the original features of blood cell counts, thus violating this data premise.

Table 5. Performance metrics of the prediction models using logistic regression and random forest methods with five-fold cross-validation (column groups: without blood cell count ratios and with blood cell count ratios, for each method).
We attempted to simplify the model, whilst maintaining good performance, by using feature selection techniques, including lasso regularization and recursive feature elimination. As correlation and feature weight analyses indicated that age group, total WBC count, monocyte count, and neutrophil count were of the greatest significance for the model, only these features were included. However, none of the ML feature selection techniques yielded a better result compared with the final model that included age group, sex, and laboratory results of ten parameters usually collected during a CBC test. This is probably due to the final prediction model demonstrating only a marginal improvement when compared with the ML selection techniques focusing on a certain metric, e.g. accuracy. In addition, many simpler models with fewer features resort to the baseline "guess" model of predicting every patient as primary headache regardless of the data.
The key strength of this study is the large sample size of >120,000 patients and the use of RWD from the UK. Awareness in the UK of the difficulties around headache diagnosis and treatment has prompted the training of GPs with special interest in headache [10] and the establishment of a network of headache centers [11]; our findings could potentially inform these endeavors.
The use of primary care data rather than data directly from ED settings is a limitation of the study, due to the extrapolation performed, which makes it difficult to assess the real-life advantages of the ML approach to differentiating headache types in the ED environment. Further studies using ED data are warranted to validate the algorithm described here for this differentiation. This study considered primary and secondary headaches as two groups of heterogeneous conditions; future work could evaluate diagnostic accuracy of measurements from CBC tests in different forms of primary and secondary headache. As this was a retrospective study of anonymized data, some patients included in the study may, for example, have comorbidities, such as an underlying infection or inflammatory condition, or may be using medications that alter certain CBC markers. Further research should be carried out in specific patient populations, such as immunocompromised individuals, to elucidate the potential prognostic value of CBC and CBC-derived ratio parameters in differentiating primary and secondary headaches.
In conclusion, this study demonstrated the use of an ML approach to create a prediction model with a modest level of performance to differentiate patients with primary and secondary headache in clinical settings.

Supporting information
S1 Table. List of Read codes used to identify diagnosis of primary headache disorders in CPRD GOLD. NOS, not otherwise specified.